Factorization tricks for LSTM networks

نویسندگان

  • Oleksii Kuchaiev
  • Boris Ginsburg
چکیده

We present two simple ways of reducing the number of parameters and accelerating the training of large Long Short-Term Memory (LSTM) networks: the first one is ”matrix factorization by design” of LSTM matrix into the product of two smaller matrices, and the second one is partitioning of LSTM matrix, its inputs and states into the independent groups. Both approaches allow us to train large LSTM networks significantly faster to the state-of the art perplexity. On the One Billion Word Benchmark we improve single model perplexity down to 24.29.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Collaborative Filtering with Recurrent Neural Networks

We show that collaborative filtering can be viewed as a sequence prediction problem, and that given this interpretation, recurrent neural networks offer very competitive approach. In particular we study how the long short-term memory (LSTM) can be applied to collaborative filtering, and how it compares to standard nearest neighbors and matrix factorization methods on movie recommendation. We sh...

متن کامل

Simulate Congestion Prediction in a Wireless Network Using the LSTM Deep Learning Model

Achieved wireless networks since its beginning the prevalent wide due to the increasing wireless devices represented by smart phones and laptop, and the proliferation of networks coincides with the high speed and ease of use of the Internet and enjoy the delivery of various data such as video clips and games. Here's the show the congestion problem arises and represent   aim of the research is t...

متن کامل

Neural Networks Compression for Language Modeling

In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g, LSTM-based networks in language modeling, are characterized with either high space complexity or substantial inference time. This problem is especially crucial for mobile applications, in which the constant interaction wi...

متن کامل

Graph-based Dependency Parsing with Bidirectional LSTM

In this paper, we propose a neural network model for graph-based dependency parsing which utilizes Bidirectional LSTM (BLSTM) to capture richer contextual information instead of using high-order factorization, and enable our model to use much fewer features than previous work. In addition, we propose an effective way to learn sentence segment embedding on sentence-level based on an extra forwar...

متن کامل

Adaptive Protection Based on Intelligent Distribution Networks with the Help of Network Factorization in the Presence of Distributed Generation Resources

Factorizing a system is one of the best ways to make a system intelligent. Factorizing the protection system, providing the right connecting agents, and transmitting the information faster and more reliably can improve the performance of a protection system and maintain system reliability against distributed generation resources. This study presents a new method for coordinating network protect...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1703.10722  شماره 

صفحات  -

تاریخ انتشار 2017